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ABSTRACT 

This article describes the implementation in the software 
package NumGfun of classical algorithms that operate on so- 
lutions of linear differential equations or recurrence relations 
with polynomial coefficients, including what seems to be the 
first general implementation of the fast high-precision nu- 
merical evaluation algorithms of Chudnovsky & Chudnovsky. 
In some cases, our descriptions contain improvements over 
existing algorithms. We also provide references to relevant 
ideas not currently used in NumGfun. 

Categories and Subject Descriptors: 1. 1.2 [Symbolic 
and Algebraic Manipulation]: Algorithms 

General Terms: Algorithms, Experimentation, Theory 

Keywords: D-finite functions, linear differential equations, 
certified numerical computation, bounds, Maple 

1. INTRODUCTION 

Support for computing with D-finite functions, that is, 
solutions of linear differential equations with polynomial co- 
efficients, has become a common feature of computer alge- 
bra systems. For instance, Mathematica now provides a 
data structure called Dif f erentialRoot to represent arbi- 
trary D-finite functions by differential equations they satisfy 
and initial values. Maple's DESol is similar but more limited. 

An important source of such general D-finite functions 
is combinatorics, due to the fact that many combinatorial 
structures have D-finite generating functions. Moreover, 
powerful methods allow to get from a combinatorial de- 
scription of a class of objects to a system of differential 
equations that "count" these objects, and then to extract 
precise asymptotic information from these equations, even 
when no explicit counting formula is available [151 I30j . A 
second major application of D-finiteness is concerned with 
special functions. Indeed, many classical functions of mathe- 
matical physics are D-finite (often by virtue of being defined 
as "interesting" solutions of simple differential equations), 
which allows to treat them uniformly in algorithms. This is 
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exploited by the Encyclopedia of Special Functions [2S] and 
its successor under development, the Dynamic Dictionary 
of Mathematical Functions [20], an interactive computer- 
generated handbook of special functions. 

These applications require at some point the ability to per- 
form "analytic" computations with D-finite functions, start- 
ing with their numerical evaluation. Relevant algorithms ex- 
ist in the literature. In particular, D-finite functions may 
be computed with an absolute error bounded by 2~" in 
nlog'^'^^n bit operations — that is, in softly linear time in 
the size of the result written in fixed-point notation — at 
any point of their Riemann surfaces _12^, the necessary er- 
ror bounds also being computed from the differential equa- 
tion and initial values [32]. However, these algorithms have 
remained theoretical [131 §9.2.1]. The ability of computer 
algebra systems to work with D-finite functions is (mostly) 
limited to symbolic manipulations, and the above-mentioned 
fast evaluation algorithm has served as a recipe to write nu- 
merical evaluation routines for specific functions rather than 
as an algorithm for the entire class of D-finite functions. 

This article introduces NumGfun, a Maple package that 
attempts to fill this gap, and contains, among other things, 
a general implementation of that algorithm. NumGfun is 
distributed as a subpackage of gfun [2H], under the GNU 
LGPL. Note that it comes with help pages: the goal of the 
present article is not to take the place of user documentation, 
but rather to describe the features and implementation of 
the package, with supporting examples, while providing an 
overview of techniques relevant to the development of simi- 
lar software. The following examples illustrate typical uses 
of NumGfun, first to compute a remote term from a com- 
binatorial sequence, then to evaluate a special functioir to 
high precision near one of its singularities. 

Example 1. The Motzkin number A/„ is the number of 
ways of drawing non-intersecting chords between n points 
placed on a circle. Motzkin numbers satisfy (n + 4)M„+2 = 
3(n + l)Afir, -f (2n-f 5)M„+i. Using NumGfun, the command 
nth_term({ (n+4) *M(n+2) = 3* (n+1) *M(n) + (2*n+5) *M(n+l) , 
M(0)=l,M(l)=l},M(n) .it) computes Miqs = 6187 . . . 7713 ~ 
^047 706 ;j^4 7 g j^^^^ ^ 2635 ... 9151 ~ 10477112 in 

1 min. Naively unrolling the recurrence (using Maple) takes 
10.7 s for Miq5, and 41 s for M2.105. On this (non-generic) 
example, nth_term could be made competitive for smaller 
indices by taking advantage of the fact that the divisions 
that occur while unrolling the recurrence are exact. 

^AU timings reported in this article were obtained with the 
following configuration: Intel T7250 CPU at 2 GHz, 1 GB 
of RAM, Linux 2.6.32, Maple 13, GMP 4.2.1. 



Example 2. The double confluent Heun function Ua,i3,f,6 
satisfies {z'^-lfU"{z) + {2z'' ~ 4z^ - az'^ + 2z + a)U'\z) + 
{I3z^ + {2a + -f)z + 5)U{z) = 0, (7(0) = 1, U'{0) = 0. It is 
singular at 2 = ±1. The command evaldif f eq(eg,y(z) , 
-0.99,1000) where eq is this differential equation yields 
t/i_i 1 3(-0.99) ^ 4.67755.. .(990 digits)... 05725 in 22 s. 

Related work. Most algorithms implemented in NumGfun 
originate in work of Chudnovsky & Chudnovsky and of van 
der Hoeven. Perhaps the most central of these is the "bit 
burst" numerical evaluation method [T^]. It belongs to the 
family of binary splitting algorithms for D-finite series, hints 
at the existence of which go back to §178], and general- 
izes earlier work of Brent 6 for specific elementary functions. 
Better known (thanks among others to |21j) binary splitting 
algorithms can be seen as special cases of the bit burst al- 
gorithm. One reason why, unlike these special cases, it was 
not used in practice is that in [T2] , none of the error control 
needed to ensure the accuracy of the computed result is part 
of the algorithm. Van der Hoeven's version "35^ addresses 
this issue, thus giving a full-fledged evaluation algorithm for 
the class of D-flnite functions, as opposed to a method to 
compute any D-flnite function given certain bounds. 

These algorithms extend to the computation of limits of 
D-flnite functions at singularities of their deflning equation. 
The case of regular singularities is treated both in [111 [12] , 
and more completely in [53], that of irregular singular points 
in [5S]. See [3] §12.7], [351 §1] for more history and context. 

On the implementation side, routines based on binary 
splitting for the evaluation of various elementary and spe- 
cial functions are used in general-purpose libraries such as 
CLN 19^ and MPFR [11123]. Binary splitting of fast con- 
verging series is also the preferred algorithm of software ded- 
icated to the high-precision computation of mathematical 
constants on standard PCs, including the current record 
holder for n [5] . Finally, even the impressive range of built-in 
functions of computer algebra systems is not always sufficient 
for applications. Works on the implementation of classes of 
"less common" special functions that overlap those consid- 
ered in NumGfun include [T] [TJ . 

This work is based in part on the earlier [26[ . 

Contribution. The main contribution presented in this arti- 
cle is NumGfun itself. We recall the algorithms it uses, and 
discuss various implementation issues. Some of these de- 
scriptions include improvements or details that do not seem 
to have appeared elsewhere. Speciflcally: (i) we give a new 
variant of the analytic continuation algorithm for D-flnite 
functions that is faster with respect to the order and degree 
of the equation; (ii) we improve the complexity analysis of 
the bit burst algorithm by a factor log log n; (iii) we point 
out that Poole's method to construct solutions of differen- 
tial equations at regular singular points can be rephrased 
in a compact way in the language of noncommutative poly- 
nomials, leading to faster evaluation of D-flnite functions in 
these points; and (iv) we describe in some detail the practical 
computation of all the bounds needed to obtain a provably 
correct result. 

What NumGfun is not. Despite sharing some of the algo- 
rithms used to compute mathematical constants to billions 
of digits, our code aims to cover as much as possible of the 
class of D-flnite functions, not to break records. Also, it 



is limited to "convergent" methods: asymptotic expansions, 
summation to the least term, and resummation of divergent 
power series are currently out of scope. 

Terminology. Like the rest of gfun, NumGfun works with 
D-finite functions and P-recursive sequences. We recall only 
basic deflnitions here; see [301 §6.4] for further properties. A 
formal power series y £ CHz]] is D-finite over .ft' C C if it 
solves a non-trivial linear differential equation 

yW(2) + ar-iiz) y^^-'\z) + ■■■+ a,{z) y{z) = (1) 

with coefficients ak G K{z). The same definition applies to 
analytic functions. A sequence u £ C"^ is P-recursive over K 
if it satisfies a non-trivial linear recurrence relation 

Un+s + bs-i{n)un+s-i-\ \-bo{n)un = 0, bkeK{n). (2) 

A sequence (un)neiN is P-recursive if and only if its generat- 
ing series X^neN*^"'^" D-finite. 

The poles of the coefficients au of ([l]) are its singular 
points; nonsingular points are called ordinary. In gfun, a 
D-finite function is represented by a differential equation 
of the form IT} and initial values at the origin, which we 
assume to be an ordinary point. Similarly, P-recursive se- 
quences are encoded by a recurrence relation plus initial 
values, as in Ex. [T] above. If y{z) = X/fc^o 2^'=^'°' ^® 

The height of an object is the maximum bit-size of the 
integers appearing in its representation: the height of a 
rational number p/q is max([logp], [logg]), and that of a 
complex number (we assume that elements of number fields 
(Q((^) are represented as X/i^'CV^ with Xi,d £ Z), poly- 
nomial, matrix, or combination thereof with rational coeffi- 
cients is the maximum height of its coefficients. We assume 
that the bit complexity A/(n) of n-bit integer multiplica- 
tion satisfies M{n) — n{logn)'-"'^\ M{n) — Q{nlogn), and 
M{n -\- m) > M{n) + AI{m), and that s x s matrices can be 
multiplied in 0(s"') operations in their coefficient ring. 

2. BINARY SPLITTING 

"Unrolling" a recurrence relation of the form ^ to com- 
pute Mo, . . . , UN takes 0(A'^^Af (log A'^)) bit operations, which 
is almost linear in the total size of tto, . . . , ujv, but quadratic 
in that of un- The binary splitting algorithm computes a 
single term itjv in essentially linear time, as follows: ^ is 
first reduced to a matrix recurrence of the first order with a 
single common denominator: 

q{n)Un+i = B{n)Un, B(n) £ Z[n]'''", g(n) £ Z[n], (3) 

so that Un = P{0, N) Uo/ {U'^Jg^ q{i)) , where P{j,i) = 
B{j ~l)---B{i-\- l)B{i). One'then computes P(0, A^) re- 
cursively as P{0,N) = P(LA/2J , A)P(0, [A/2J), and the 
denominator J^^q^ q{i) in a similar fashion (but separately, 
in order to avoid expensive gcd computations). 

The idea of using product trees to make the most of fast 
multiplication dates back at least to the seventies [U §12.7]. 
The general statement below is from [121 Theorem 2.2], ex- 
cept that the authors seem to have overlooked the cost of 
evaluating the polynomials at the leaves of the tree. 

Theorem 1 (Chudnovsky, Chudnovsky). Let u he 
a P-recursive sequence over (Q(i), defined by Assume 
that the coefficients bk{n) of have no poles in N. Let d, h 



denote bounds on their degrees ( of numerators and denomi- 
nators) and heights, and d',h' corresponding bounds for the 
coefficients of B{n) and q{n) in Q. Assuming N ^ s,d, 
the binary splitting algorithm outlined above computes one 
term un of u in 0{s'^M{N{h' + d' log iV)) log(iVft')), that 
is, 0{s'^M{sdN{h + log N))\og{Nh)), bit operations. 

Proof sketch. Write H = h' + d'logN. Computing 
the product tree P{0,N) takes 0{s'^ M{NH) log N) bit op- 
erations |12l §2] (see also Prop. [T] below), and the evalua- 
tion of each leaf B{i) may be done in 0{M{H) log d') opera- 
tions [5] §3.3]. This gives itjv as a fraction that is simplified 
in 0{M{NH)log{NH)) operations 8, §1.6]. 

Now consider how is rewritten into With co- 

efficients in Z[i] rather than (Q(i), the bk{n) have height 
h" < l)h. To get B{n) and q{n), it remains to reduce 
to common denominator the whole equation; hence d' < sd 
and h' < s(/i" -|- log s -I- log d). These two conversion steps 
take 0{M{sdhlog^ d)) and 0{M{d'h' logs)) operations re- 
spectively, using product trees. The assumption d,s = o{N) 
allows to write H = 0{sd{h + log A'')) and get rid of some 
terms, so that the total complexity simplifies as stated. □ 

Since the height of ujv may be as large as Q{{N + h) logN), 
this result is optimal with respect to h and A'^, up to loga- 
rithmic factors. The same algorithm works over any alge- 
braic number field instead of (Q(i). This is useful for eval- 
uating D-finite functions "at singularities" (|4|. More gen- 
erally, similar complexity results hold for product tree com- 
putations in torsion-free Z-algebras (or (Q-algebras: we then 
write A = <Q (g)a A' for some Z-algebra A' and multiply in 
Z X A'), keeping in mind that, without basis choice, the 
height of an element is defined only up to some additive 
constant. 

Constant-factor improvements. Several techniques per- 
mit to improve the constant hidden in the O(-) of Theorem[T] 
by making the computation at each node of the product tree 
less expensive. We consider two models of computation. 

In the FFT model, we assume that the complexity of long 
multiplication decomposes as M{n) = 3F{2n) + 0{n), where 
F{n) is the cost of a discrete Fourier transform of size n (or 
of another related linear transform, depending on the algo- 
rithm). FFT- based integer multiplication algorithms adapt 
to reduce the multiplication of two matrices of height n in 
Z^'^" to 0(n) multiplications of matrices of height 0{1), for 
a total of 0{s^ M{n) + s'^n) bit operations. This is known as 
"FFT addition" g], "computing in the FFT mode" [31], or 
"FFT invariance". A second improvement ("FFT doubling", 
attributed to R. Kramer in |4, §12.8]) is specific to the com- 
putation of product trees. The observation is that, at an 
internal node where operands of size n get multiplied using 
three FFTs of size 2n, every second coefficient of the two 
direct DFTs is already known from the level below. 

The second model is black-box multiplication. There, we 
may use fast multiplication formulae that trade large in- 
teger multiplications for additions and multiplications by 
constants. The most obvious example is that the prod- 
ucts in (Q(i) may be done in four integer multiplications 
using Karatsuba's formula instead of five with the naive al- 
gorithm. In general, elements of height h of an algebraic 
number field of degree d may be multiplied in 2dA4{h) + 0{h) 
bit operations using the Toom-Cook algorithm. The same 
idea applies to the matrix multiplications. Most classical 



matrix multiplication formulae such as Strassen's are so- 
called bilinear algorithms. Since we are working over a 
commutative ring, we may use more general quadratic al- 
gorithms ^ §14.1]. In particular, for all s, Waksman's al- 
gorithm ^8] multiplies s x s matrices over a commutative 
ring R using s^[s/2] + (2s — l)[s/2j multiplications in R, 
and Makarov's i24, multiplies 3x3 matrices in 22 scalar 
multiplications. These formulas alone or as the base case of 
a Strassen scheme achieve what seems to be the best known 
multiplication count for matrices of size up to 20. 

Exploiting these ideas leads to the following refinement 
of Theorem [T] Similar results can be stated for general Z- 
algebras, using their rank and multiplicative complexity [9]. 

Proposition 1. Let d' and h' denote bounds on the de- 
grees and heights (respectively) of B{n) and q{n) in Eq. {JP- 
As N,h' ^ oo (s and d' being fixed), the number of bit op- 
erations needed to compute the product tree P{0, N) is at 
most (C + o{l))M{N{h' -I- d' log A^) log(Ar/i')), C = 
(2s^ + l)/6 in the FFT model, and C = (3MM(s) + l)/4 
in the black-box model. Here MM(s) < (s'^ + 3s^)/2 is the 
algebraic complexity of s X s matrix multiplication over TL. 

Proof. Each node at the level k (level being the root) 
of the tree essentially requires multiplying sxs matrices with 
entries in TL\i\ of height Hu = A'(/i' + d' log A')/2'''+\ plus 
denominators of the same height. In the FFT model, this 
may be done in (2s^-|-l)M(ii"fc) operations. Since we assume 

M(n) = n(nlogn), we have X)fc=o^^ 2''M{Hk) < \M{Ho) 
(a remark attributed to D. Stehle in [39]). Kramer's trick 
saves another factor |. In the black-box model, the cor- 
responding cost for one node is (3MM(s) -I- l)M(Hk) with 
Karatsuba's formula. Stehle's argument applies again. □ 



Note that the previous techniques save time only for dense 
objects. In particular, one should not use the "fast" matrix 
multiplication formulae in the few bottom levels of product 
trees associated to recurrences of the form ([HJ, since the 
matrices at the leaves are companion. 

Continuing on this remark, these matrices often have some 
structure that is preserved by successive multiplications. For 
instance, let s„ = X/k-o where (u„) satisfies It is 
easy to compute a recurrence and initial conditions for (sn) 
and go on as above. However, unrolling the recurrences ^ 
and s„+i — Sn = Un simultaneously as 



Un+s-l 
Un-\-s 
Sn + 1 
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Un+s-2 
Sn 



(4) 



is more efficient. Indeed, all matrices in the product tree 
for the numerator of Q then have a rightmost column of 
zeros, except for the value in the lower right corner, which 
is precisely the denominator. With the notation MM(s) of 
Proposition [T] each product of these special matrices uses 
MM(s) + s'^ + s + l muhiplications, vs. MM(s -I- 1) -1- 1 for 
the dense variant. Hence the formula Q is more efficient as 
soon as MM(s -|- 1) - MM(s) > s(s -I- 1), which is true both 
for the naive multiplication algorithm and for Waksman's 
algorithm (compare [39]). In practice, on Ex. [3] below, if 
one puts Un = s„+i — s„ in ([U [3} instead of using @, the 
computation time grows from 1.7 s to 2.7 s. 



The same idea applies to any recurrence operator that 
can be factored. Further examples of structure in product 
trees include even and odd D-finite series {e.g., [SI §4.9.1]). 
In all these cases, the naive matrix multiplication algorithm 
automatically benefits from the special shape of the problem 
(because multiplications by constants have negligible cost), 
while fast methods must take it into account explicitly. 

Remark 1. A weakness of binary splitting is its compara- 
tively large space complexity Q.{n\ogn). Techniques to re- 
duce it are known and used by efficient implementations in 
the case of recurrences of the first order [101 1171 1191 [3] . 

Implementation. The implementation of binary splitting in 
NumGfun includes some of the tricks discussed in this sec- 
tion. FFT-based techniques are currently not used because 
they are not suited to implementation in the Maple language. 
This implementation is exposed through two user-level func- 
tions, nth_term and fnth_term, that allow to evaluate P- 
recursive sequences (fnth_term replaces the final gcd by a 
less expensive division and returns a floating-point result). 
Additionally, gf un [rectoproc] , which takes as input a re- 
currence and outputs a procedure that evaluates its solution, 
automatically calls the binary splitting code when relevant. 
Examples [T] and [3] illustrate the use of these functions on 
integer sequences and convergent series respectively. 

Example 3. [3 §6.10] Repeated integration by parts of the 
integral representation of F yields for < Re(2) < 1 

Viz) = > — — + / e^'^u^^^Au. 

Taking t = 29^^, it follows that the sum y^^^°° tt„, where 
uo = 87e"' and (3n + 4)u„+i = Ston, is within 10-100°° of 
r(l/3), whence r(l/3) ~ 2.67893 . . . (9990 digits) . . . 99978. 
This computation takes 1.7 s. 

3. HIGH-PRECISION EVALUATION OF 
D-FINITE FUNCTIONS 

We now recall the numerical evaluation algorithms used 
in NumGfun, and discuss their implementation. 

Let y{z) — Vnz" be a D-finite series with radius of 
convergence p at the origin. Let 2 G (Q(j) be such that \z\ < 
p and height(2) < h. The sequence {y„z") is P-recursive, so 
that the binary splitting algorithm yields a fast method for 
the high-precision evaluation of y{z). Here "high-precision" 
means that we let the precision required for the result go to 
infinity in the complexity analysis of the algorithm. 

More precisely, (ynz") is canceled by a recurrence relation 
of height 0{h). By Theorem[T] y{z) may hence be computed 
to the precision lO^'' in 

0{M{N{h + log N))log{Nh)) (5) 

bit operations, where is chosen so that |j/jv;(z)| < 10"'', 
i.e. N ~ p/log(p/|2|) if p < 00, and A'" ~ p/(rlogp) for 
some T if p = 00. 

In practice, the numbers of digits of (absolute) precision 
targeted in NumGfun range from the hundreds to the mil- 
lions. Accuracies of this order of magnitude are useful in 
some applications to number theory [I^, and in heuristic 
equality tests |33l §5]. It can also happen that the easiest 



way to obtain a moderate-precision output involves high- 
precision intermediate computations, especially when the 
correctness of the result relies on pessimistic bounds. 

Analytic continuation. Solutions of the differential equa- 
tion lU defined in the neighborhood of extend by analytic 
continuation to the universal covering of <D\S, where S is the 
(finite) set of singularities of the equation. D-finite functions 
may be evaluated fast at any point by a numerical version of 
the analytic continuation process that builds on the previous 
algorithm J^ . Rewrite U]) in matrix form 

Y'(z)=A{z)Y{z), Y{z)=(y^) . (6) 

This choice of Y{z) induces, for all zq £ C \ 5", that of a 
family of canonical solutions of ^ defined by 

y[zo,j]{z) = (2 - zoY + 0((2 - 20)''), < i < r, 

that form a basis of the solutions of ([T]) in the neighborhood 
of 20. Stated otherwise, the vector y[2o] = {y[zo, j])o<j<r of 
canonical solutions at 20 is the first row of the fundamental 
matrix y[2o](2) of ® such that ^[20] (20) — Id. 

By linearity, for any path 20 — > 21 in C \ S, there exists a 
transition matrix Mzg^^i that "transports initial conditions" 
(and canonical solutions) along the path: 

Y{zi)^M,„^,,Y{zo), y[zo]iz) ^y[zi](z)M,„^,,. (7) 

This matrix is easy to write out explicitly: 

M,„_,, = y[20](2l) = (iy[20,j]«(2l)) , (8) 

\i! / 0<i,j<r 

evaluations at 21 being understood to refer to the analytic 
continuation path 20 — >■ 21. Transition matrices compose: 

Mzg^zi^Z2 = Mz-i-^z^Mza^zi, Mzi^zo ~ Mzg^zi- (9) 

NumGfun provides functions to compute for a given 
path 7 (transition_matrix), and to evaluate the analytic 
continuation of a D-finite function along a path starting at 
(evaldif f eq). In both cases, the path provided by the user 
is first subdivided into a new path 20 — >■ 21 — > ■ ■ ■ — )■ 2m, 
Zi £ Q(i), such that, for all £, the point 2^+1 lies within 
the disk of convergence of the Taylor expansions at ze of all 
solutions of IJ}. Using ((5)), approximations Mt £ <5^{iy^^ of 
Mzg^zi^i are computed by binary splitting. 

More precisely, we compute all entries of Me at once, as 
follows. For a generic solution y of ((T|, the coefficients u„ of 
u{z) = y{ze + z) = X^^q Unz" are canceled by a recurrence 
of order s. Hence the partial sums u-„{z) of u satisfy 

{ f/„+i A _ {C{n)z 0\{ [/„ \ 
[u.,„+^{z)J K 1) [u.,„{z)J' 

Bin) 

where K ^ (1,0, ...,0) and C(n) e (Q(n)"'' = . Introducing 
an indeterminate 5, we let 2 = ze+i — ze + S £ Q(4)[5] and 
compute B{N — 1) . . . B{0) mod by binary splitting (an 
idea already used in [32]), for some suitable N. The upper 
left blocks of the subproducts are kept factored as a power 
of 2 times a matrix independent on 2. In other words, clear- 
ing denominators, the computation of each subproduct 

-P = i ( J ) = fhigh-Piow (p = numer(2)'") 



is performed as D i~- Dhigh-Diow ; P PhighPiow ; R ^ 
Piow(-RhighCiow) + rfhigh-Riow ; d ^ dhighrfiow The powers 
of z can be computed faster, but the saving is neghgible. 
Applying the row submatrix R of the full product to the 
matrix Uo = ijiy[ze, j]^''Hze))o^i<s,o^j<r yields a row vector 
equal to {y[ze,j]-N{zi+i + 5))o^j<r + 0{5^), each entry of 
which is a truncated power series whose coefficients are the 
entries of the corresponding column of Mi. This way of 
computing Mi is roughly min(r, s'^"'^) times more efficient 
than the fastest of the variants from |12l 132) . 

In the function transition_matrix, we then form the 
product Mm-i ■ ■ ■ Mo, again by binary splitting. In the case 
of evaldiffeq, we compute only the first row R of Mm-i, 
and we form the product RMm-2 ■ ■ ■ MqI, where / is an 
approximation with coefficients in (Q(i) of the vector Y{zo) 
of initial conditions (or this vector itself, if it has symbolic 
entries). The whole computation is done on unsimplified 
fractions, controlling all errors to guarantee the final result. 
Let us stress that no numerical instability occurs since all nu- 
merical operations are performed on rational numbers. We 
postpone the discussion of approximation errors (including 
the choice of N) to H 

Example 4- Transition matrices corresponding to paths 
that "go round" exactly one singularity once are known as 
local monodromy matrices. A simple example is that of the 
equation (1 + z^)y" + 2zy' = 0, whose solution space is gen- 
erated by 1 and arctanz. Rather unsurprisingly, around i: 

> transition_matrix( (l+z~2) *dif f (y(z) ,z,y(z) ,z) 
+2*z*diff (y(z) ,z) , y(z) , [0,1+1,2*1,1-1,0], 20); 



1 .00000000000000000000 




3. 14159265358979323846 \ 
1.00000000000000000000 



More generally, expressing them as entries of monodromy 
matrices is a way to compute many mathematical constants. 

Another use of analytic continuation appears in Ex. [2l 
there, despite the evaluation point lying within the disk of 
convergence, NumGfun performs analytic continuation along 
the path — )• ^ — >■ ^ — ^ — >■ — >■ to approach 
the singularity more easily (c/. [121 Prop. 3.3]). 

The "bit burst" algorithm. The complexity result ^ is 
quasi-optimal for h = O(logp), but becomes quadratic in p 
for h — 0{p), which is the size of the approximation 2 £ (Q(i) 
needed to compute y{z) for arbitrary 2 G C. This issue can 
be avoided using analytic continuation to approach z along 
a path zo ~^ zi ^ ■ ■ ■ ~^ Zm = z made of approximations 
of z whose precision grow geometrically: 



\ze+i ~ zt\ <2 



height(z^) = 0(2^ 



(11) 



thus balancing h and \ z\. This idea is due to Chudnovsky and 
Chudnovsky [T^], who called it the bit burst algorithm, and 
independently to van der Hoeven with a tighter complexity 
analysis [321 I37| . The following proposition improves this 
analysis by a factor log log p in the case where y has a finite 
radius of convergence. No similar improvement seems to 
apply to entire functions. 

Proposition 2. Let y be a D-finite function. One may 
com,pute a -approximation of y at a point z G Q(i) of 
height 0{p) in 0{M{plog^ p)) bit operations. 



Proof. By © and (fTTI) . the step ze -> ze+i of the bit- 
burst algorithm takes 0(M(p(2* -|- logp) logp/2^)) bit oper- 
ations. Now 



1^0 



logP ^ M (^mnlogp + 



Plog P 



1=0 



and m = 0{logp), hence the total complexity. □ 

Example 5. Consider the D-finite function y defined by 
the following equation, picked at random: 



.5 1 19 

z H ; 

vl2 4 24 



5 

24'' 



!\ (1) I / I ^ 1 -'-^ 2 1 ... 

+V 24 ~^ 



— - —Z+-Z'' 

12 24 8 



24 3 
3 



24 



■ H : 

4 12 



12 

5 2 1 3\ / 
-z H z ) y 

6 2 / 



/ 5 23 7 2 1 3\ 

y(o) = ^,y'(o) = ^,."(o) = ^,y"'(o) = ^. 

The singular points are z ~ 3.62 and z ~ 0.09 ± 0.73i. By 
analytic continuation (along a path — > vri homotopic to a 
segment) followed by bit-burst evaluation, we obtain 

y{Tvi) -0.52299. ..(990 digits)... 53279 - 1. 50272. ..90608i 

after about 5 min. This example is among the "most gen- 
eral" that NumGfun can handle. Intermediate computations 
involve recurrences of order 8 and degree 17. 

4. REGULAR SINGULAR POINTS 

The algorithms of the previous section extend to the case 
where the analytic continuation path passes through regu- 
lar singular points of the differential equation [111 1121 I33| . 
Work is in progress to support this in NumGfun, with two 
main applications in view, namely special functions (such 
as Bessel functions) defined using their local behaviour at a 
regular singular point, and "connection coefficients" arising 
in analytic combinatorics IT5l, § VII.9]. We limit ourselves to 
a sketchy (albeit technical) discussion. 

Recall that a finite singular point zq of a linear differential 
equation with analytic coefficients is called regular when all 
solutions y{z) have moderate growth y{z) = l/{z — zo)'^^^^ as 
z — Zq in a sector with vertex at zq, or equivalently when the 
equation satisfies a formal property called Fuchs' criterion. 
The solutions in the neighborhood of zo then have a simple 
structure: for some t G N, there exist linearly independent 
formal solutions of the form (with zq = 0) 



y{z) 



Mz)log''z, AgC, (Pke€[[z]] (12) 



in number equal to the order r of the differential equation. 
Additionally, the (/f>fe converge on the open disk centered at 
extending to the nearest singular point, so that the solu- 
tions ([12[) in fact form a basis of analytic solutions on any 
open sector with vertex at the origin contained in this disk. 
(See for instance 28 for proofs and references.) 

Several extensions of the method of indeterminate coef- 
ficients used to obtain power series solutions at ordinary 
points allow to determine the coefficients of the series (f>k. 
We proceed to revisit Poole's variant [281 §16] of Heffter's 
method [221 Kap. 8] using the "operator algebra point of 
view" on indeterminate coefficients: a recursive formula for 
the coefficients of the series expansion of a solution is ob- 
tained by applying simple rewriting rules to the equation. 



This formulation makes the algorithm simpler for our pur- 
poses (compare [3T1[TT][33]) and leads to a small complexity 
improvement. It also proves helpful in error control (§6). 

Since our interest lies in the D-finite case, we assume that 
is a regular singular point of Equation Letting 9 — 

zj^, the equation rewrites as L{z,6) ■ y = where L is 
a polynomial in two noncommutative indeterminates (and 
1/(2, 9) has no nontrivial left divisor in -/^[z]). Based on H12p . 
we seek solutions as formal logarithmic sums 

Let us call the double sequence y = {yn,k)ne\+z.k£¥i the 
coefficient sequence of y. The shift operators Sn and 5**; 
on such double sequences are defined by Sn ■ y ~ (j/n+i,fc), 
and Sk • y = {Vn.k+i)- The heart of the method lies in the 
following observation. 

Proposition 3. Let y{z) be a formal logarithmic sum. 
The relation L{z,9) ■ y — holds (formally) iff the coeffi- 
cient sequence y of y satisfies L{S^^,n + Sk) ■ y — 0. 

Proof. The operators z and 9 act on logarithmic sums 
by z ■ y{z) = E„eA+sEfc>oy"-i.'=^^" and 9 ■ y{z) = 

J2ne\+z Z]fe>o("2/".''- + 2/".'=+i)^^lr^^"- ^^^^ coefficient 
sequence of L{z, 9) ■ y is L{S^^ ,n + Sk) ■ y- □ 

Assume that y{z) satisfies (U]). Then L{S^^ , n + Sk)-y = 0; 
additionally, (|12[1 implies that yn,k = for k > t, which 
translates into Sl ■ y ^ 0. Set L{z,9) = Qo{9) + R{z,9)z, 
and fix n £ A + Z. If Qo{n) 7^ 0, then Qo{n + X) is invertible 
in K{\)[[X]], and 

y = -{Qo{n + Sk)-' mod Si) R{S-\n + Sk)S-' ■ y. 

In general, let /i(n) be the multiplicity of n as a root of Qo- 
Take T„ £ K{\)[X] such that T„(X)(X~"(")(3o(n + X)) = 

1 + 0(X*) (explicitly, TniSk) = Y^Hli^^j) ^^JD- 
Then, it holds that 

5M(n) . y ^ -Tn{Sk)R{S-\n + Sk)S-' ■ y, (13) 

hence the y„'^k with n < n determine (j/n,fe)fc>n(7i) , while 
the yn,k, k < nin) remain free. 

Coupled with the condition = for A < following 
from (|12p . this implies that a solution y is characterized by 
{yn,k)(n,k)eE, where E = {{n,k) \ k < fi{n)}. As in ^ this 
choice of initial values induces that of canonical solutions 
(at 0) y[n,k] defined by y[n,k]n.k ~ 1 and y[n,k]„i^y = 
for {n',k') £ E \ {{n,k)}. The notion of transition matrix 
extends, see 133'. §4] for a detailed discussion. 

Equation H13|) is a "recurrence relation" that lends itself to 
binary splitting. The main difference with the setting of ^is 
that the companion matrix of the "recurrence operator" now 
contains truncated power series, i.e., B G K{n)[[Sk]]/{Sl). 
The coefficients y[u,v]n = j/[it, u]„,fc log'' z of canoni- 

cal solutions may be computed fast by forming the product 
B(n - 1) ■ ■ ■ B{u) G K(X)[[Sk]]/{Si) and applying it to the 
initial values Yu = (0, . . . , 0, log" z)"^ . Compared to the al- 
gorithm of [33], multiplications of power series truncated to 
the order t replace multiplications of t x t submatrices, so 
that our method is slightly faster. As in ^ all entries of 
Mzo_>zi may (and should) be computed at once. 



5. ERROR CONTROL 

Performing numerical analytic continuation rigorously re- 
quires a number of bounds that control the behaviour of the 
function under consideration. We now describe how error 
control is done in NumGfun. Some of the ideas used in this 
section appear scattered in |32)-|36). NumGfun relies almost 
exclusively on a priori bounds; see [271 §5.2] for pointers to 
alternative approaches, and [3^ for further useful techniques, 
including how to refine rough a priori bounds into better a 
posteriori bounds. 

We start by discussing the computation of majorant series 
for the canonical solutions of the differential equation. Then 
we describe how these are used to determine, on the one 
hand, at which precision each transition matrix should be 
computed for the final result to be correct, and on the other 
hand, where to truncate the series expansions to achieve 
this precision. Finally, we propose a way to limit the cost of 
computing bounds in "bit burst" phases. 

Remark 2. In practice, numerical errors that happen 
while computing the error bounds themselves are not always 
controlled, due to limited support for interval arithmetic in 
Maple. However, we have taken some care to ensure that 
crucial steps rely on rigorous methods. 

Majorant series. A formal power series g G is a 

majorant series of y £ <D[[z]], which we denote hy y < g, if 
\yn\ < Qn for all n. If y{z) < g{z) and y{z) < g{z), then 

y{z)<g{\z\), -±y(,)<-±g(z), ^^^^ 

y + y<g + g, yy<gg, yoy<gog. 

We shall allow majorant series to be formal logarithmic sums 
or matrices. The relation < extends in a natural way: we 
write X;„,fcJ/",fc-2:"log2'= < X^n.fc log iff \yn,k\ < 

gn.k for all n and k, and Y — (yij) < G — (gi.j) iff Y 
and G have the same format and yij <1 gij for all i,j. In 
particular, for scalar matrices, y < G if \yi,j\ < gi,j for all 
i,j. The inequalities (1141) still hold. 

Bounds on canonical solutions. The core of the error con- 
trol is a function that computes g{z) such that 

Vj, y[zo,j]izQ+z) < g{z) (15) 

(in the notations of ^SJ given Equation U]) and a point zo. 
This function is called at each step of analytic continuation. 

The algorithm is based on that of [2Z| ("SB" in what fol- 
lows), designed to compute "asymptotically tight" symbolic 
bounds. Run over Q{i) instead of (Q, SB returns (in the 
case of convergent series) a power series satisfying (|15|l of the 
form g{z) = X],T=o "•~^""<^('^)-^"' ^^^^^ ^ ^ " ^ Q, 
a.nd<P{n) = e°("'. The series g is specified by r, a, and other 
parameters of no interest here that define (p. The tightness 
property means that r and a are the best possible. 

However, the setting of numerical evaluation differs from 
that of symbolic bounds in several respects: (i) the issue is no 
more to obtain human-readable formulae, but bounds that 
are easy to evaluate; (ii) bound computations should be fast; 
(iii) at regular singular points, the fundamental solutions are 
logarithmic sums, not power series. For space reasons, we 
only sketch the differences between SB and the variant we 
use for error control. 

First, we take advantage of (i) to solve (ii). The most 
important change is that the parameter a is replaced by 



an approximation ct > a. This avoids computations with 
algebraic numbers, the bottleneck of SB. Strictly speaking, 
it also means that we are giving up the tightness of the 
bound. However, in constrast with the "plain" majorant se- 
ries method [33U34l[36] . we are free to take a arbitrarily close 
to a, since we do not use the ratio (a/d)" to mask subex- 
ponential factors. The algorithms from [27] adapt without 
difficulty. Specifically, Algorithms 3 and 4 are modified to 
work with a instead of a. Equality tests between "dominant 
roots" (Line 9 of Algorithm 3, Line 5 of Algorithm 4) can 
now be done numerically and are hence much less expensive. 
As a consequence, the parameter T from §3.3 is also replaced 
by an upper bound. This affects only the parameter (j>{n) of 
the bound, which is already pessimistic. 

The issue (iii) can be dealt with too. One may compute 

a series g such that y{z) < 9{^)'Y^k=o '"li ^ (with the no- 
tations of by modifying the choice of K in |27l §3.3] 

so that Y.t^M ali' qTx) )x=J\ - -^/"''' '^^'^ replacing [23 
Eq. (14)] by Eq. p3p above in the reasoning. 

Global error control. We now consider the issue of control- 
ling how the approximations at each step of analytic con- 
tinuation accumulate. Starting with a user-specified error 
bound e, we first set e' so that an e'-approximation f £ (Q(i) 
of the exact result r rounds to o(f) G UneN 10~"Z[i] with 
|o(f) — r\ < e. No other rounding error occur, since the rest 
of the computation is done on objects with coefficients in 
(Q(i). However, we have to choose the precision at which 
to compute each factor of the product H = RMm-i ■ ■ ■ Mo I 
(in evaldiffeq, and of similar products for other analytic 
continuation functions) so that |f — r| < e . 

As is usual for this kind of applications, we use the Frobe- 
nius norm, defined for any (not necessarily square) matrix by 
||(ai,i)||F = (Yli ■ I'^^i.jl^)^''^- The Frobenius norm satisfies 
|^_B||p < ||y4||F ||_B||f for any matrices A,B of compatible 
dimensions. If is a square r x r matrix, 

\\A\\o. < \\\A\\\2 < ||A||f <r|lA||^ (16) 

where |||-|||2 is the matrix norm induced by the Euclidean 
norm while ||-||oo denotes the entrywise uniform norm. Most 
importantly, the computation of ||j4||f is numerically stable, 
and if A has entries in <Q(i), it is easy to compute a good 
upper bound on ||A||f in fioating-point arithmetic. 

We bound the total error en on H using repeatedly the rule 
\\AB-AB\\f < \\A\\f\\B~B\\f + \\A-A\\f\\B\\f. From there, 
we compute precisions such that having — yl||F < ea 
for each factor ^ of H ensures that en < e' . Upper bounds on 
the norms ||^||f and ||j4||f appearing in the error estimate 
are computed either from an approximate value of A (usually 
A itself) if one is available at the time, or from a matrix G 
such that A<G given by (fTil[T3]) . 

Local error control. Let us turn to the computation of 
individual transition matrices. We restrict to the case of 
ordinary points. Given a precision e determined by the 
"global error control" step, our task is to choose A'' such that 
\\M — MzQ^zi ||f < e if M is computed by truncating the en- 
tries of ([SI to the order A''. If g satisfies p5|) . it suffices that 
~^o\) < 7-e for all i, as can be seen from (|14lll6p . We 
find a suitable N by dichotomic search. For the family of ma- 
jorant series g used in NumGfun, the g^^^X^) ^re not always 
easy to evaluate, so we bound them by expressions involving 



only elementary functions [27l §4.2]. The basic idea, related 
to the saddle-point method from asymptotic analysis, is that 
if a; > 0, inequalities like gn;{x) < {x/tn)'^g{t„){l — x/t„)~^ 
are asymptotically tight for suitably chosen t„. 

Points of large bit-size. Computing the majorants (fT^ is 
expensive when the point zq has large height. This can be 
fixed by working with an approximate value c of zq to ob- 
tain a bound valid in a neighborhood of c that contains zq. 
This technique is useful mainly in "bit burst" phases (where, 
additionally, we can reuse the same c at each step). 

Assume that "K[c](c + z) < G(z) for some c « zq. Since 
Y[zo]{zo + z) = Y[c]{c+{zo-c) + z)M-4zo by ©, it follows 
that y[zo](2o + z) < G{\zo - c\ + z) \\M~4zo\\l where 1 is 
a square matrix filled with ones. Now Mc-tc+z = Id-fO(z), 
whence Mc^c+z - Id < G{z) - G(0). For l^o - c| < 77, this 
implies WMc^za - Id|lF <S~ \\G{r]) - G(0)||f. Choosing 
77 small enough that 5 < 1, we get HM^^qIIf < (1 — ^)~^ ■ 
If G was computed from p5|l . i.e., for G(z) = (ji-g'-'^ (z)) . ., 
this finally gives the bound 

y[zo,j]{zo + z) < j^ J_gf ^'\v + z), 

valid for all zq in the disk \zo — c\ < rj. 

Note however that simple differential equations have solu- 
tions like exp{K/{l — z)) with large K. The condition 5 < 1 
then forces us to take c of size fl{K). Our strategy to deal 
with this issue in NumGfun is to estimate K using a point c 
of size 0(1) and then to choose a more precise c (as a last 
resort, c = zo) based on this value if necessary. 

Remark 3. If the evaluation point z is given as a program, 
a similar reasoning allows to choose automatically an approx- 
imation z G Q(i) of z such that \y{z) — y{z) \ is below a given 
error bound |32t §4.3]. In other applications, it is useful 
to have bounds on transition matrices that hold uniformly 
for all small enough steps in a given domain. Such bounds 
may be computed from a majorant differential equation with 
constant coefficients [351 §5]. 

6. REPEATED EVALUATIONS 

Drawing plots or computing integrals numerically requires 
to evaluate the same function at many points, often to a few 
digits of precision only. NumGfun provides limited support 
for this through the function dif f eqtoproc, which takes as 
input a D-finite function y, a target precision e, and a list 
of disks, and returns a Maple procedure that performs the 
numerical evaluation of y. For each disk D — {z, \z — c\ < p}, 
dif f eqtoproc computes a polynomial p £ (Q(i)[z] such that 
\o{p{z)) — 2/(2)1 < e for z £ Dn Q(i), where o again denotes 
rounding to complex decimal. The procedure returned by 
diff eqtoproc uses the precomputed p when possible, and 
falls back on calling evaldiffeq otherwise. 

The approximation polynomial p is constructed as a linear 
combination of truncated Taylor series of fundamental solu- 
tions y[c,j], with coefficients obtained by numerical analytic 
continuation from to c. The way we choose expansion or- 
ders is similar to the error control techniques of we first 
compute a bound -Bean on the fundamental solutions and 
their first derivatives on the disk D. The vector Y{c) of "ini- 
tial values" at c is computed to the precision e'/Bcan where 

< e/(2'")- We also compute _Bini > ||F(c)||f. Eachj/[c,j] is 



expanded to an order A'^ such that |[2/[c, j]jv; |[oo,d ini 5 

SO that finally \p{z) — y{z)\ < e for z £ D. 

The most important feature of diffeqtoproc is that it 
produces certified results. At low precisions and in the ab- 
sence of singularities, we expect that interval-based numer- 
ical solvers will perform better while still providing (a pos- 
teriori) guarantees. Also note that our simple choice of p 
is far from optimal. If approximations of smaller degree 
or height are required, a natural approach is to aim for a 
slightly smaller error \\y — p\\oo,D above, and then replace p 
by a polynomial p for which we can bound ||p — p||oo [Ml 
§6.2]. 

Example 6. The following plot of the function y defined by 
(z-1) y"'-z{2z-b) y"-{Az-G) y'+z^{z-l) y with the initial 
values j/(0) — 2, y'(0) = 1, y"{0) = was obtained using 
polynomial approximations on several disks that cover the 
domain of the plot while avoiding the pole z = 1. The whole 
computation takes about 9 s. Simple numerical integrators 
typically fail to evaluate y beyond z = 1. 




7. FINAL REMARKS 

Not all of NumGfun was described in this article. The 
symbolic bounds mentioned in ^ are also implemented, 
with functions that compute majorant series or other kinds 
of bounds on rational functions (bound_ratpoly), D-finite 
functions (bound_dif f eq and bound_dif f eq_tail) and P- 
recursive sequences (bound_rec and bound_rec_tail). This 
implementation was already presented in (27] . 

Current work focuses on adding support for evaluation "at 
regular singular points" (as outlined in ^4]), and improving 
performance. The development version of NumGfun already 
contains a second implementation of binary splitting, writ- 
ten in C and called from the Maple code. In the longer 
term, I plan to rewrite other parts of the package "from the 
bottom up", both for efficiency reasons and to make useful 
subroutines independant of Maple. 
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